When a conference involves three or more endpoints, joining can be irritating both for participants who are attempting to join and for those who have already joined. For example, some participants may have technical difficulties joining, particularly when using a traditional hardware room system with a cumbersome user interface. Other participants may simply be late and, upon joining, may interrupt the ongoing conference. In some conferences, for example, beeps and announcements disrupt the user experience of participants who have already joined. There is a need for approaches that improve the user experience when connecting multiple endpoints to a video conference by minimizing technical difficulties (e.g., those arising from different types of endpoints and their differing connection procedures) and by allowing coordinated connection to the conference. The present application discloses embodiments that address aspects of this need.